Non-oblivious Strategy Improvement
We study strategy improvement algorithms for mean-payoff and parity games. We
describe a structural property of these games, and we show that these
structures can affect the behaviour of strategy improvement. We show how
awareness of these structures can be used to accelerate strategy improvement
algorithms. We call our algorithms non-oblivious because they remember
properties of the game that they have discovered in previous iterations. We
show that non-oblivious strategy improvement algorithms perform well on
examples that are known to be hard for oblivious strategy improvement. Hence,
we argue that previous strategy improvement algorithms fail because they ignore
the structural properties of the game that they are solving.
The Complexity of All-switches Strategy Improvement
Strategy improvement is a widely-used and well-studied class of algorithms
for solving graph-based infinite games. These algorithms are parameterized by a
switching rule, and one of the most natural rules is "all switches" which
switches as many edges as possible in each iteration. Continuing a recent line
of work, we study all-switches strategy improvement from the perspective of
computational complexity. We consider two natural decision problems, both of
which have as input a game $G$, a starting strategy $s$, and an edge $e$. The
problems are: 1.) The edge switch problem, namely, is the edge ever
switched by all-switches strategy improvement when it is started from $s$ on
game $G$? 2.) The optimal strategy problem, namely, is the edge used in the
final strategy that is found by strategy improvement when it is started from
$s$ on game $G$? We show PSPACE-completeness of the edge switch
problem and optimal strategy problem for the following settings: Parity games
with the discrete strategy improvement algorithm of Vöge and Jurdziński;
mean-payoff games with the gain-bias algorithm [14,37]; and discounted-payoff
games and simple stochastic games with their standard strategy improvement
algorithms. We also show PSPACE-completeness of an analogous problem
to edge switch for the bottom-antipodal algorithm for finding the sink of an
Acyclic Unique Sink Orientation on a cube.
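The all-switches rule described in this abstract can be illustrated with a minimal sketch. It makes strong simplifying assumptions that are not from the paper: a one-player (maximizer only) discounted game, approximate strategy evaluation by fixed-point iteration, and hypothetical names (`edges`, `evaluate`, `all_switches`). In each round, every vertex with an improving edge is switched to its most appealing edge, using values computed once for the current strategy.

```python
def evaluate(edges, strategy, discount, sweeps=2000):
    """Approximate the discounted value of a fixed strategy by fixed-point iteration."""
    values = {v: 0.0 for v in edges}
    for _ in range(sweeps):
        values = {
            v: edges[v][strategy[v]][0] + discount * values[edges[v][strategy[v]][1]]
            for v in edges
        }
    return values


def all_switches(edges, strategy, discount, tol=1e-9):
    """Strategy improvement that switches every improvable vertex in each round.

    edges[v] is a list of (reward, target) pairs; strategy[v] indexes into it.
    Returns the final strategy and the number of rounds in which a switch occurred.
    """
    strategy = dict(strategy)
    rounds = 0
    while True:
        values = evaluate(edges, strategy, discount)  # values of the current strategy
        switched = False
        for v, out in edges.items():
            # Appeal of each edge, measured against the current strategy's values.
            appeal = [reward + discount * values[target] for reward, target in out]
            best = max(range(len(out)), key=lambda i: appeal[i])
            if appeal[best] > values[v] + tol:
                strategy[v] = best
                switched = True
        if not switched:
            return strategy, rounds
        rounds += 1


# Hypothetical two-vertex example: each vertex has a zero-reward self-loop
# and a rewarding edge to the other vertex.
example_edges = {
    "a": [(0.0, "a"), (1.0, "b")],
    "b": [(0.0, "b"), (4.0, "a")],
}
final_strategy, rounds = all_switches(example_edges, {"a": 0, "b": 0}, discount=0.5)
```

Starting from the strategy that uses both self-loops, the sketch switches both vertices in the same round, which is the behaviour that distinguishes all-switches from single-switch rules.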
Time and Parallelizability Results for Parity Games with Bounded Tree and DAG Width
Parity games are a much-researched class of games in NP ∩ coNP that
are not known to be in P. Consequently, researchers have considered specialised
algorithms for the case where certain graph parameters are small. In this
paper, we study parity games on graphs with bounded treewidth, and graphs with
bounded DAG width. We show that parity games with bounded DAG width can be
solved in O(n^(k+3) k^(k + 2) (d + 1)^(3k + 2)) time, where n, k, and d are the
size, treewidth, and number of priorities in the parity game. This is an
improvement over the previous best algorithm, given by Berwanger et al., which
runs in n^O(k^2) time. We also show that, if a tree decomposition is provided,
then parity games with bounded treewidth can be solved in O(n k^(k + 5) (d +
1)^(3k + 5)) time. This improves over the previous best algorithm, given by
Obdržálek, which runs in O(n d^(2(k+1)^2)) time. Our techniques can also be
adapted to show that the problem of solving parity games with bounded treewidth
lies in the complexity class NC^2, which is the class of problems that can be
efficiently parallelized. This is in stark contrast to the general parity game
problem, which is known to be P-hard, and thus unlikely to be contained in NC.
Bounded Satisfiability for PCTL
While model checking PCTL for Markov chains is decidable in polynomial time,
the decidability of PCTL satisfiability, as well as its finite model property,
are long-standing open problems. While general satisfiability is an intriguing
challenge from a purely theoretical point of view, we argue that general
solutions would not be of interest to practitioners: such solutions could be
too big to be implementable or even infinite. Inspired by bounded synthesis
techniques, we turn to the more applied problem of seeking models of a bounded
size: we restrict our search to implementable -- and therefore reasonably
simple -- models. We propose a procedure to decide whether or not a given PCTL
formula has an implementable model by reducing it to an SMT problem. We have
implemented our techniques and found that they can be applied to the practical
problem of sanity checking -- a procedure that allows a system designer to
check whether their formula has an unexpectedly small model.
Computing Approximate Nash Equilibria in Polymatrix Games
In an $\epsilon$-Nash equilibrium, a player can gain at most $\epsilon$ by
unilaterally changing his behaviour. For two-player (bimatrix) games with
payoffs in $[0,1]$, the best-known $\epsilon$ achievable in polynomial time is
0.3393. In general, for $n$-player games an $\epsilon$-Nash equilibrium can be
computed in polynomial time for an $\epsilon$ that is an increasing function of
$n$ but does not depend on the number of strategies of the players. For
three-player and four-player games the corresponding values of $\epsilon$ are
0.6022 and 0.7153, respectively. Polymatrix games are a restriction of general
$n$-player games where a player's payoff is the sum of payoffs from a number of
bimatrix games. There exists a very small but constant $\epsilon$ such that
computing an $\epsilon$-Nash equilibrium of a polymatrix game is PPAD-hard.
Our main result is that a $(0.5+\delta)$-Nash equilibrium of an $n$-player
polymatrix game can be computed in time polynomial in the input size and
$1/\delta$. Inspired by the algorithm of Tsaknakis and Spirakis, our
algorithm uses gradient descent on the maximum regret of the players. We also
show that this algorithm can be applied to efficiently find a
$(0.5+\delta)$-Nash equilibrium in a two-player Bayesian game.
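The approximation notion used in this abstract (a player can gain at most a bounded amount by deviating unilaterally) can be made concrete with a small sketch for the two-player case. It is only an illustration of the definition, not the paper's gradient-descent algorithm; the function names and the example game are assumptions introduced here.

```python
def expected_payoff(matrix, x, y):
    """Expected payoff x^T M y for mixed strategies x (rows) and y (columns)."""
    return sum(x[i] * matrix[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def regrets(R, C, x, y):
    """Return (row regret, column regret) of the profile (x, y).

    A player's regret is the payoff of their best pure response to the
    opponent's mixed strategy minus the payoff they currently receive.
    """
    best_row = max(sum(R[i][j] * y[j] for j in range(len(y)))
                   for i in range(len(x)))
    best_col = max(sum(x[i] * C[i][j] for i in range(len(x)))
                   for j in range(len(y)))
    return (best_row - expected_payoff(R, x, y),
            best_col - expected_payoff(C, x, y))


def nash_epsilon(R, C, x, y):
    """Smallest epsilon for which (x, y) is an epsilon-Nash equilibrium."""
    return max(regrets(R, C, x, y))


# Hypothetical example: matching pennies with payoffs rescaled to [0, 1].
R = [[1, 0], [0, 1]]   # row player's payoffs
C = [[0, 1], [1, 0]]   # column player's payoffs
uniform = [0.5, 0.5]
pure_first = [1.0, 0.0]

eps_uniform = nash_epsilon(R, C, uniform, uniform)
eps_pure = nash_epsilon(R, C, pure_first, pure_first)
```

Here the uniform profile is an exact equilibrium (maximum regret 0), while the pure profile leaves the column player a regret of 1. The quantity returned by `nash_epsilon` is the maximum regret that the abstract's algorithm is described as minimizing.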
Strategy iteration algorithms for games and Markov decision processes
In this thesis, we consider the problem of solving two-player infinite games,
such as parity games, mean-payoff games, and discounted games, and the problem of
solving Markov decision processes. We study a specific type of algorithm for solving
these problems that we call strategy iteration algorithms. Strategy improvement
algorithms are an example of a type of algorithm that falls under this classification.
We also study Lemke’s algorithm and the Cottle-Dantzig algorithm, which
are classical pivoting algorithms for solving the linear complementarity problem.
The reduction of Jurdzinski and Savani from discounted games to LCPs allows these
algorithms to be applied to infinite games [JS08]. We show that, when they are
applied to games, these algorithms can be viewed as strategy iteration algorithms.
We also resolve the question of their running time on these games by providing a
family of examples upon which these algorithms take exponential time.
Greedy strategy improvement is a natural variation of strategy improvement,
and Friedmann has recently shown an exponential lower bound for this algorithm
when it is applied to infinite games [Fri09]. However, these lower bounds do not
apply for Markov decision processes. We extend Friedmann’s work in order to prove
an exponential lower bound for greedy strategy improvement in the MDP setting.
We also study variations on strategy improvement for infinite games. We
show that there are structures in these games that current strategy improvement
algorithms do not take advantage of. We also show that lower bounds given by
Friedmann [Fri09], and those that are based on his work [FHZ10], work because they
exploit this ignorance. We use our insight to design strategy improvement algorithms
that avoid poor performance caused by the structures that these examples use.
Distributed Methods for Computing Approximate Equilibria
We present a new, distributed method to compute approximate Nash equilibria
in bimatrix games. In contrast to previous approaches that analyze the two
payoff matrices at the same time (for example, by solving a single LP that
combines the two players' payoffs), our algorithm first solves two independent
LPs, each of which is derived from one of the two payoff matrices, and then
computes approximate Nash equilibria using only limited communication between
the players.
Our method has several applications for improved bounds for efficient
computations of approximate Nash equilibria in bimatrix games. First, it yields
the best known polynomial-time algorithm for computing approximate well-supported
Nash equilibria (WSNE), which is guaranteed to find a 0.6528-WSNE in polynomial
time. Furthermore, since our algorithm solves the two LPs separately, it can be
used to improve upon the best known algorithms in the limited communication
setting: the algorithm can be implemented to obtain a randomized
expected-polynomial-time algorithm that uses poly-logarithmic communication and
finds a 0.6528-WSNE. The algorithm can also be carried out to beat the best
known bound in the query complexity setting, requiring $O(n \log n)$ payoff
queries to compute a 0.6528-WSNE. Finally, our approach can also be adapted to
provide the best known communication efficient algorithm for computing
approximate Nash equilibria: it uses poly-logarithmic communication to
find a 0.382-approximate Nash equilibrium.
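The distinction between approximate Nash equilibria and approximate well-supported Nash equilibria (WSNE) that this abstract relies on can be sketched as follows. This uses the standard definitions rather than anything specific to the paper's LP-based method, and the function names and example game are assumptions introduced for illustration. In an approximate WSNE, every pure strategy played with positive probability must itself be an approximate best response.

```python
def row_pure_payoffs(R, y):
    """Payoff of each pure row strategy against the column player's mix y."""
    return [sum(R[i][j] * y[j] for j in range(len(y))) for i in range(len(R))]


def col_pure_payoffs(C, x):
    """Payoff of each pure column strategy against the row player's mix x."""
    return [sum(x[i] * C[i][j] for i in range(len(x))) for j in range(len(C[0]))]


def nash_epsilon(R, C, x, y):
    """Smallest epsilon for which (x, y) is an epsilon-Nash equilibrium."""
    rows, cols = row_pure_payoffs(R, y), col_pure_payoffs(C, x)
    row_regret = max(rows) - sum(p * v for p, v in zip(x, rows))
    col_regret = max(cols) - sum(p * v for p, v in zip(y, cols))
    return max(row_regret, col_regret)


def wsne_epsilon(R, C, x, y):
    """Smallest epsilon for which (x, y) is an epsilon-well-supported NE."""
    rows, cols = row_pure_payoffs(R, y), col_pure_payoffs(C, x)
    # Compare the best response with the WORST pure strategy in the support.
    row_gap = max(rows) - min(v for v, p in zip(rows, x) if p > 0)
    col_gap = max(cols) - min(v for v, p in zip(cols, y) if p > 0)
    return max(row_gap, col_gap)


# Hypothetical example: the row player puts a little weight on a bad row,
# and the column player is indifferent between all outcomes.
R = [[1, 1], [0, 0]]
C = [[1, 1], [1, 1]]
x = [0.9, 0.1]
y = [1.0, 0.0]

eps_nash = nash_epsilon(R, C, x, y)
eps_wsne = wsne_epsilon(R, C, x, y)
```

In this example the profile is a 0.1-Nash equilibrium but only a 1-WSNE, because the second row is in the support yet earns 1 less than the best response. This shows why WSNE guarantees are harder to obtain than approximate Nash guarantees.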